Temporal Difference Approach to Playing Give-Away Checkers
Abstract
In this paper we examine the application of temporal difference methods to learning a linear approximation of the state value function for the game of give-away checkers. Empirical results show that the TD(λ) algorithm can be used successfully to improve the quality of the playing policy in this domain. Training games against both strong and random opponents were considered. The results show that learning only from negative game outcomes improved the performance of the learning player against strong opponents.
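As a rough illustration of the kind of update the abstract refers to, the following is a minimal sketch of TD(λ) with accumulating eligibility traces for a linear state value function. It is a generic textbook form, not the paper's exact setup: the function names, the feature vectors, the outcome encoding (+1 win, 0 draw, -1 loss), and the parameter defaults are all assumptions made for the example.

```python
from typing import List, Sequence


def value(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear state value: dot product of weights and position features."""
    return sum(w * f for w, f in zip(weights, features))


def td_lambda_episode(
    weights: Sequence[float],
    states: Sequence[Sequence[float]],
    outcome: float,
    alpha: float = 0.1,
    lam: float = 0.7,
    gamma: float = 1.0,
) -> List[float]:
    """Apply online TD(lambda) updates over one finished game.

    states  -- feature vectors of the positions seen by the learner, in order
               (hypothetical features, e.g. piece counts)
    outcome -- final game result, assumed here to be +1, 0 or -1
    Returns the updated weight vector; the input is left unchanged.
    """
    w = list(weights)
    trace = [0.0] * len(w)
    for t, feats in enumerate(states):
        is_last = t == len(states) - 1
        # The target is the game outcome at the end, otherwise the
        # discounted value of the next position under the current weights.
        target = outcome if is_last else gamma * value(w, states[t + 1])
        delta = target - value(w, feats)
        # For a linear function the gradient w.r.t. the weights is the
        # feature vector itself, so the trace accumulates decayed features.
        trace = [gamma * lam * e + f for e, f in zip(trace, feats)]
        w = [wi + alpha * delta * e for wi, e in zip(w, trace)]
    return w
```

Under these assumptions, restricting learning to negative game outcomes would amount to calling `td_lambda_episode` only for games with `outcome < 0`; how the paper actually implements that restriction is not stated in the abstract.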
Similar resources
An Emergence Of Game Strategy In Multiagent Systems
In this paper, we study the emergence of game strategy in multiagent systems. Symbolic and subsymbolic approaches are compared. The symbolic approach is represented by a backtracking algorithm with a specified search depth, whereas the subsymbolic approach is represented by feedforward neural networks adapted by the reinforcement temporal difference TD(λ) technique. As a test game, we used simp...
Evolutionary-based heuristic generators for checkers and give-away checkers
Two methods of genetic evolution of linear and non-linear heuristic evaluation functions for the game of checkers and give-away checkers are presented in the paper. The first method is based on the simplistic assumption that a relation ‘close’ to partial order can be defined over the set of the evaluation functions. Hence explicit fitness function is not necessary in this case and direct compar...
GOjen: tdGo Temporal Difference Learning of Go Playing Artificial Neural Networks
The original project description was: an existing Java application handling and visualizing Go games between human and computer players (including trained and evolved ANNs) should be improved and extended with Go playing ANNs trained by temporal difference learning. This extension should serve as a basis for comparisons of TD learning with conventional ANN training and evolutionary methods...
Comparison of TDLeaf(λ) and TD(λ) Learning in Game Playing Domain
In this paper we compare the results of applying TD(λ) and TDLeaf(λ) algorithms to the game of give-away checkers. Experiments show comparable performance of both algorithms in general, although TDLeaf(λ) seems to be less vulnerable to weight over-fitting. Additional experiments were also performed in order to test three learning strategies used in self-play. The best performance was achieved w...
Distributed Decision Making in Checkers
The game of checkers can be played by machines running either heuristic search algorithms or complex decision making programs trained using machine learning techniques. The first approach has been used with remarkable success. The latter approach yielded encouraging results in the past, but later results were not so useful, partly because of the limitations of current machine learning algorithms....
Publication date: 2004